Results 1 - 7 of 7
1.
Nat Biotechnol ; 2023 Apr 20.
Article in English | MEDLINE | ID: covidwho-2294540

ABSTRACT

Current methods for inference of phylogenetic trees require running complex pipelines at substantial computational and labor costs, with additional constraints in sequencing coverage, assembly and annotation quality, especially for large datasets. To overcome these challenges, we present Read2Tree, which directly processes raw sequencing reads into groups of corresponding genes and bypasses traditional steps in phylogeny inference, such as genome assembly, annotation and all-versus-all sequence comparisons, while retaining accuracy. In a benchmark encompassing a broad variety of datasets, Read2Tree is 10-100 times faster than assembly-based approaches and in most cases more accurate, the exception being when sequencing coverage is high and reference species very distant. Here, to illustrate the broad applicability of the tool, we reconstruct a yeast tree of life of 435 species spanning 590 million years of evolution. We also apply Read2Tree to >10,000 Coronaviridae samples, accurately classifying highly diverse animal samples and near-identical severe acute respiratory syndrome coronavirus 2 sequences on a single tree. The speed, accuracy and versatility of Read2Tree enable comparative genomics at scale.

2.
Gigascience ; 11, 2022 05 28.
Article in English | MEDLINE | ID: covidwho-1873910

ABSTRACT

BACKGROUND: The Syrian hamster (Mesocricetus auratus) has been suggested as a useful mammalian model for a variety of diseases and infections, including infection with respiratory viruses such as SARS-CoV-2. The MesAur1.0 genome assembly was generated in 2013 using whole-genome shotgun sequencing with short-read sequence data. Current more advanced sequencing technologies and assembly methods now permit the generation of near-complete genome assemblies with higher quality and greater continuity. FINDINGS: Here, we report an improved assembly of the M. auratus genome (BCM_Maur_2.0) using Oxford Nanopore Technologies long-read sequencing to produce a chromosome-scale assembly. The total length of the new assembly is 2.46 Gb, similar to the 2.50-Gb length of a previous assembly of this genome, MesAur1.0. BCM_Maur_2.0 exhibits significantly improved continuity, with a scaffold N50 that is 6.7 times greater than MesAur1.0. Furthermore, 21,616 protein-coding genes and 10,459 noncoding genes are annotated in BCM_Maur_2.0 compared to 20,495 protein-coding genes and 4,168 noncoding genes in MesAur1.0. This new assembly also improves the unresolved regions as measured by nucleotide ambiguities, where ∼17.11% of bases in MesAur1.0 were unresolved compared to BCM_Maur_2.0, in which the number of unresolved bases is reduced to 3.00%. CONCLUSIONS: Access to a more complete reference genome with improved accuracy and continuity will facilitate more detailed, comprehensive, and meaningful research results for a wide variety of future studies using Syrian hamsters as models.


Subject(s)
Chromosomes, Mammalian , Mesocricetus , Animals , Chromosomes, Mammalian/genetics , Genome , High-Throughput Nucleotide Sequencing/methods , Mesocricetus/genetics , Whole Genome Sequencing
3.
Nat Commun ; 13(1): 1321, 2022 03 14.
Article in English | MEDLINE | ID: covidwho-1740438

ABSTRACT

Infectious disease monitoring on Oxford Nanopore Technologies (ONT) platforms offers rapid turnaround times and low cost. Tracking low frequency intra-host variants provides important insights with respect to elucidating within-host viral population dynamics and transmission. However, given the higher error rate of ONT, accurate identification of intra-host variants with low allele frequencies remains an open challenge with no viable computational solutions available. In response to this need, we present Variabel, a novel approach and first method designed for rescuing low frequency intra-host variants from ONT data alone. We evaluate Variabel on both synthetic data (SARS-CoV-2) and patient derived datasets (Ebola virus, norovirus, SARS-CoV-2); our results show that Variabel can accurately identify low frequency variants below 0.5 allele frequency, outperforming existing state-of-the-art ONT variant callers for this task. Variabel is open-source and available for download at: www.gitlab.com/treangenlab/variabel .


Subject(s)
COVID-19 , Nanopore Sequencing , Nanopores , High-Throughput Nucleotide Sequencing/methods , Humans , SARS-CoV-2/genetics
4.
PLoS One ; 16(8): e0244468, 2021.
Article in English | MEDLINE | ID: covidwho-1371999

ABSTRACT

The newly emerged and rapidly spreading SARS-CoV-2 causes coronavirus disease 2019 (COVID-19). To facilitate a deeper understanding of the viral biology we developed a capture sequencing methodology to generate SARS-CoV-2 genomic and transcriptome sequences from infected patients. We utilized an oligonucleotide probe-set representing the full-length genome to obtain both genomic and transcriptome (subgenomic open reading frames [ORFs]) sequences from 45 SARS-CoV-2 clinical samples with varying viral titers. For samples with higher viral loads (cycle threshold value under 33, based on the CDC qPCR assay) complete genomes were generated. Analysis of junction reads revealed regions of differential transcriptional activity among samples. Mixed allelic frequencies along the 20kb ORF1ab gene in one sample suggested the presence of a defective viral RNA species subpopulation maintained in mixture with functional RNA. The associated workflow is straightforward, and hybridization-based capture offers an effective and scalable approach for sequencing SARS-CoV-2 from patient samples.


Subject(s)
COVID-19/pathology , SARS-CoV-2/genetics , Sequence Analysis, DNA/methods , COVID-19/virology , DNA, Complementary/chemistry , DNA, Complementary/metabolism , Gene Frequency , Genetic Variation , Genome, Viral , Humans , Open Reading Frames/genetics , RNA, Viral/genetics , RNA, Viral/metabolism , Real-Time Polymerase Chain Reaction , SARS-CoV-2/isolation & purification , Viral Load
5.
Genome Res ; 31(4): 635-644, 2021 04.
Article in English | MEDLINE | ID: covidwho-1145214

ABSTRACT

The COVID-19 pandemic has sparked an urgent need to uncover the underlying biology of this devastating disease. Though RNA viruses mutate more rapidly than DNA viruses, there are a relatively small number of single nucleotide polymorphisms (SNPs) that differentiate the main SARS-CoV-2 lineages that have spread throughout the world. In this study, we investigated 129 RNA-seq data sets and 6928 consensus genomes to contrast the intra-host and inter-host diversity of SARS-CoV-2. Our analyses yielded three major observations. First, the mutational profile of SARS-CoV-2 highlights intra-host single nucleotide variant (iSNV) and SNP similarity, albeit with differences in C > U changes. Second, iSNV and SNP patterns in SARS-CoV-2 are more similar to MERS-CoV than SARS-CoV-1. Third, a significant fraction of insertions and deletions contribute to the genetic diversity of SARS-CoV-2. Altogether, our findings provide insight into SARS-CoV-2 genomic diversity, inform the design of detection tests, and highlight the potential of iSNVs for tracking the transmission of SARS-CoV-2.


Subject(s)
COVID-19/diagnosis , COVID-19/transmission , Genetic Variation , Genome, Viral , Real-Time Polymerase Chain Reaction/methods , SARS-CoV-2/genetics , COVID-19/virology , Host-Pathogen Interactions , Humans , Polymorphism, Single Nucleotide
6.
Nat Commun ; 12(1): 1660, 2021 03 12.
Article in English | MEDLINE | ID: covidwho-1132065

ABSTRACT

In less than nine months, the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) killed over a million people, including >25,000 in New York City (NYC) alone. The COVID-19 pandemic caused by SARS-CoV-2 highlights clinical needs to detect infection, track strain evolution, and identify biomarkers of disease course. To address these challenges, we designed a fast (30-minute) colorimetric test (LAMP) for SARS-CoV-2 infection from naso/oropharyngeal swabs and a large-scale shotgun metatranscriptomics platform (total-RNA-seq) for host, viral, and microbial profiling. We applied these methods to clinical specimens gathered from 669 patients in New York City during the first two months of the outbreak, yielding a broad molecular portrait of the emerging COVID-19 disease. We find significant enrichment of a NYC-distinctive clade of the virus (20C), as well as host responses in interferon, ACE, hematological, and olfaction pathways. In addition, we use 50,821 patient records to find that renin-angiotensin-aldosterone system inhibitors have a protective effect for severe COVID-19 outcomes, unlike similar drugs. Finally, spatial transcriptomic data from COVID-19 patient autopsy tissues reveal distinct ACE2 expression loci, with macrophage and neutrophil infiltration in the lungs. These findings can inform public health and may help develop and drive SARS-CoV-2 diagnostic, prevention, and treatment strategies.


Subject(s)
COVID-19/genetics , COVID-19/virology , SARS-CoV-2/genetics , Adult , Aged , Angiotensin Receptor Antagonists/pharmacology , Angiotensin-Converting Enzyme Inhibitors/pharmacology , Antiviral Agents/pharmacology , COVID-19/epidemiology , COVID-19 Nucleic Acid Testing , Drug Interactions , Female , Gene Expression Profiling , Genome, Viral , HLA Antigens/genetics , Host Microbial Interactions/drug effects , Host Microbial Interactions/genetics , Humans , Male , Middle Aged , Molecular Diagnostic Techniques , New York City/epidemiology , Nucleic Acid Amplification Techniques , Pandemics , RNA-Seq , SARS-CoV-2/classification , SARS-CoV-2/drug effects , COVID-19 Drug Treatment
7.
bioRxiv ; 2020 May 01.
Article in English | MEDLINE | ID: covidwho-823190

ABSTRACT

The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has caused thousands of deaths worldwide, including >18,000 in New York City (NYC) alone. The sudden emergence of this pandemic has highlighted a pressing clinical need for rapid, scalable diagnostics that can detect infection, interrogate strain evolution, and identify novel patient biomarkers. To address these challenges, we designed a fast (30-minute) colorimetric test (LAMP) for SARS-CoV-2 infection from naso/oropharyngeal swabs, plus a large-scale shotgun metatranscriptomics platform (total-RNA-seq) for host, bacterial, and viral profiling. We applied both technologies across 857 SARS-CoV-2 clinical specimens and 86 NYC subway samples, providing a broad molecular portrait of the COVID-19 NYC outbreak. Our results define new features of SARS-CoV-2 evolution, nominate a novel, NYC-enriched viral subclade, reveal specific host responses in interferon, ACE, hematological, and olfaction pathways, and examine risks associated with use of ACE inhibitors and angiotensin receptor blockers. Together, these findings have immediate applications to SARS-CoV-2 diagnostics, public health, and new therapeutic targets.
